Estimation of relative effectiveness of phylogenetic programs by machine learning
Abstract
Reconstructing the phylogeny of a protein family from a sequence alignment can produce results of varying quality. Our goal is to predict the quality of phylogeny reconstruction from features that can be extracted from the input alignment. We used the Fitch-Margoliash (FM) method of phylogeny reconstruction and a random forest as the predictor. To train and test the predictor, we used alignments of orthologous series (OS), for which the result of phylogeny reconstruction can be evaluated by comparison with the trees of the corresponding organisms. Our results show that the quality of phylogeny reconstruction can be predicted with more than 80% precision. We also tried to predict which phylogeny reconstruction method, FM or UPGMA, is better for a particular alignment. With the feature set used, among the alignments for which the predictor predicts better performance by UPGMA, 56% do give a better result with UPGMA. Given that UPGMA performs better for only 34% of the alignments in our testing set, this result shows that it is possible in principle to predict the better phylogeny reconstruction method from features of a sequence alignment.
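The comparison between the 56% precision and the 34% base rate can be made explicit with a small calculation. The sketch below is illustrative only: the function names and the counts (56 correct out of 100 predictions) are hypothetical, chosen solely to reproduce the percentages reported in the abstract, and are not taken from the study's data.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of positive predictions that turn out to be correct."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("no positive predictions were made")
    return true_positives / predicted_positive


def lift(precision_value: float, base_rate: float) -> float:
    """Ratio of the predictor's precision to the precision of blind guessing."""
    if not 0.0 < base_rate <= 1.0:
        raise ValueError("base_rate must be in (0, 1]")
    return precision_value / base_rate


# Hypothetical counts: of 100 alignments predicted to favor UPGMA,
# 56 actually do, which matches the reported 56% precision.
upgma_precision = precision(true_positives=56, false_positives=44)

# Reported share of test alignments for which UPGMA performs better.
upgma_base_rate = 0.34

upgma_lift = lift(upgma_precision, upgma_base_rate)
print(f"precision = {upgma_precision:.2f}, lift over base rate = {upgma_lift:.2f}")
```

A lift above 1 means that the predictor selects UPGMA-favorable alignments more often than picking alignments at random from the testing set would.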
Journal:
- Journal of bioinformatics and computational biology
Volume 12, Issue 2
Pages -
Publication date 2014